Abstract

To analyse the significance of the digits used for interval bounds, we
clarify the philosophical presuppositions of various interval notations. We 
use information theory to determine the information content of the last 
digit of the numeral used to denote the interval's bounds. This leads to 
the notion of efficiency of a decimal digit: the actual value as percentage 
of the maximal value of its information content. By taking this efficiency 
into account, many presentations of intervals can be made more readable 
at the expense of negligible loss of information. 

1 Introduction 

Once upon a time, it was a matter of professional ethics among computers never 
to write a meaningless decimal. Since then computers have become machines 
and thereby lost any form of ethics, professional or otherwise. The human 
computers of yore were helped in their ethical behaviour by the fact that it took 
effort to write spurious decimals. Now the situation is reversed: the lazy way 
is to use the default precision of the I/O library function. As a result, we are 
deluged with meaningless decimals. 

Of course interval arithmetic is not guilty of such negligence. After all, the 
very raison d'etre of the subject is to be explicit about the precision of computed 
results. Yet, even interval arithmetic is plagued by superfluous decimals, albeit 
in a more subtle way. In this note we first review the various interval notations. 
We argue in favour of a rarely used notation called "tail" , or "factored" , which 
has the advantage of avoiding the repetition of decimals that are necessarily the 
same. We analyse the information content of the remaining decimals. 

2 Philosophical implications of an interval notation

Several papers ^, |^ discussing interval notations have been published re- 
cently. The various notations have different implications, just as people have 
different reasons for being interested in interval arithmetic. 



For some, intervals are a way of denoting a fuzzy, or perhaps probabilis- 
tic, quantity. Others use intervals to give an indication of the extent to which 
rounding has introduced error in a computation. Here we assume an interpreta- 
tion of intervals that does not necessarily negate the above interpretations, but 
differs in the way it is made precise. We call it the set interpretation of interval 
arithmetic. 

The set interpretation According to the set interpretation, variables range 
over the real numbers. These reals are represented in computer memory as sets 
of reals. The constraint is that if variable x is represented by set S, we have 
x ∈ S. Thus the set interpretation differs from conventional numerical analysis
in the absence of errors. It is either true or false that x belongs to S. 

The fact that S contains more than one real is not an error. In conventional 
numerical analysis, an error arises when, for example, a real variable x with 
value 0.1 is represented by a floating-point number f. An error arises because
x = f is false. On the other hand, representing x by S is not an error if x ∈ S.
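The contrast between "x = f is false" and "x ∈ S is true" can be made concrete with a minimal sketch. It assumes IEEE double precision as the floating-point format and Python 3.9 or later (for `math.nextafter`); the helper name `enclose` is ours and does not come from any interval library.

```python
import math
from fractions import Fraction

def enclose(x: Fraction) -> tuple[float, float]:
    """Return doubles (lo, hi) with lo <= x <= hi, adjacent unless x is representable."""
    f = float(x)                          # a double close to x (correctly rounded in CPython)
    if Fraction(f) == x:                  # x happens to be a double: degenerate interval
        return f, f
    if Fraction(f) < x:                   # f lies below x: step up for the upper bound
        return f, math.nextafter(f, math.inf)
    return math.nextafter(f, -math.inf), f  # f lies above x: step down for the lower bound

x = Fraction(1, 10)                       # the real number 0.1, held exactly
f = float(x)
lo, hi = enclose(x)

print(Fraction(f) == x)                   # is "x = f" true?
print(Fraction(lo) <= x <= Fraction(hi))  # is "x in S" true?
```

The comparison is done in exact rational arithmetic, so the membership test itself introduces no rounding.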

Of course, the statement x ∈ S provides only a limited amount of information
about x. The larger S is, the less information. In the set interpretation of
interval arithmetic we distinguish error, which is avoidable, from the inescapable 
fact that the amount of information yielded by a finite machine is finite. 

Consequences of the set interpretation Interval arithmetic is no excep- 
tion to the rule that finite machines can only give a finite amount of information. 
In interval arithmetic the sets of reals are limited to those that are easily repre- 
sentable: closed, connected sets of reals that have finite floating-point numbers 
as bounds, if they have a bound at all. Unbounded closed connected sets of 
reals use the infinities of the floating-point standard in the obvious way. Each 
of this finite set of sets of reals can be represented by a pair of floating-point 
numbers. It is also the case that for every set of reals, there exists a unique 
least floating-point interval containing it. 

This is the set interpretation of interval arithmetic. Its virtues include that 
it is familiar. In fact, many people are surprised to hear it given a name, as 
this is what they always thought intervals to be. Another virtue is that, if 
the set interpretation is followed up in all its consequences, it allows resolution 
of potential ambiguities in interval arithmetic, especially in interval division 
involving unbounded intervals, intervals containing zero, or intervals containing 
nothing but zero ||]. 

3 Interval notations 

If one accepts the advantages of the set interpretation of interval arithmetic, 
then one prefers a notation for an interval that suggests a set. The traditional 
notation, exemplified by [1.233, 1.235] has this advantage. Although widely 
used, it is not practical, as is apparent^ from the statement that an unknown
real x belongs to 

[+0.6180339887498946804, +0.6180339887498950136]. (1) 

The problem with this ubiquitous notation is that it is hard to separate two 
important pieces of information: where the interval is, and how wide it is. To 
remedy this defect, Hyvönen described a notation according to which one
writes instead

+0.61803398874989[46804, 50136]. (2)

The situation is similar when we are annoyed by having to write 

0.61803398874989x + 0.61803398874989y,

which we prefer to have in factored form: 0.61803398874989(x + y). Hence we
propose to refer to (2) as factored notation for intervals^. The name is more
than an analogy: in general, one factors with respect to a multiplicative infix 
operation, of which concatenation on strings is an example. 
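Factoring with respect to string concatenation can be sketched in a few lines. The function name `factored` is hypothetical, and the sketch assumes that both bounds are written with the same sign, the same exponent, and the same number of characters, as in example (1).

```python
def factored(lo: str, hi: str) -> str:
    """Move the common leading characters of two decimal numerals outside the brackets."""
    if len(lo) != len(hi):
        raise ValueError("this sketch expects numerals of equal length")
    n = 0
    while n < len(lo) and lo[n] == hi[n]:   # scan for the leftmost differing character
        n += 1
    if n == len(lo):                        # identical bounds: nothing to bracket
        return lo
    return f"{lo[:n]}[{lo[n:]}, {hi[n:]}]"

print(factored("+0.6180339887498946804", "+0.6180339887498950136"))
print(factored("1.233", "1.235"))
```

The scan performed by the loop is exactly the digit-by-digit comparison that a reader of the classic notation has to do by eye.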

In the example the bounds are in normalized scientific notation and have the 
same exponent. In general, factored notation converts an interval [a × 10^p, b ×
10^q], with normalized numerals as bounds, first to [a, b × 10^{q-p}] × 10^p, where
the upper bound is not necessarily normalized. When p ≠ q, then this cannot
be shortened by taking an initial string of common first decimals outside the
brackets. It can only be shortened by limiting the precision of a and b, a topic
we address later in the paper. 

Table 1 contains an overview of interval notations. Most of the table is
adopted from Hyvönen. In this overview we distinguish three categories: (a)
those that suggest a set, (b) those that suggest a number degraded by an error, 
and (c) those that suggest a pure number. The Classic and Factored notations 
belong to category (a). Under category (b) we have added, in analogy with the 
Tilde notation, the Plus notation. This latter notation is useful in the improve- 
ment of the factored notation discussed later on in this paper. Category (c) is in 
the last line. Hyvönen used the name "Fortran notation". The notation is actually
the "Single-number notation" for the Fortran implementation described
in §. ^ 

The virtue of the notations in category (b) is that they make explicit that a 
numeral is not to be interpreted according to mathematical notation, by which 
we mean that 

    d_m d_{m-1} \cdots d_0 . d_{-1} \cdots d_{-n}        (3)

denotes the number \sum_{i=-n}^{m} d_i 10^i. Mathematical notation implies an infinite
number of zeros after the last digit when n > 0.

^ I'm not making this up; see page 122 of |^]. 

^ The notation has been occasionally used without comment in the literature; see for
example M. Credit goes to Hyvönen, whose paper B was the first to appear in print that
drew attention to it and named it. Independently I did so in [7]. Hyvönen called it "tail
notation".
Notation              Interval value      Name of notation

[1.233,1.235]         [1.233,1.235]       Classic
1.23[3,5]             [1.233,1.235]       Factored
1.234 ±2              [1.232,1.236]       Range
1.234~                [1.2335,1.2345]     Tilde
1.234+                [1.234,1.235]       Plus
1.234+[-1e-3,2e-3]    [1.233,1.236]       Error
1.234*                [1.233,1.235]       Star
1.234                 [1.233,1.235]       Single-Number

Table 1: Overview of interval notations, adapted from Hyvönen



Mathematical notation is not the only way to interpret (3). For a long time
physicists, chemists, and engineers have used the convention that a numeral has 
as meaning any number that rounds to the number denoted by the numeral dis- 
played. The coexistence of mathematical notation with the physics convention 
introduces an ambiguity that is often resolved by context. With intervals, the 
ambiguity becomes problematic, as we need numerals to denote the bounds of 
an interval in the classic notation. Are these to be interpreted according to 
mathematical notation, or according to the physics convention? It is implicit 
in most of the interval literature, and explicit in (^,0], that the numerals in 
the bounds of an interval are to be interpreted according to the mathematical 
notation. In this paper we follow that rule. 
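The difference between the two readings of a numeral can be illustrated with a short sketch. The function names are ours, and we assume the physics convention means "within half a unit of the last displayed place"; how ties at exactly half a unit are treated is left open here.

```python
from decimal import Decimal

def mathematical_value(numeral: str) -> Decimal:
    """Mathematical notation: the numeral denotes exactly one number."""
    return Decimal(numeral)

def physics_interval(numeral: str) -> tuple[Decimal, Decimal]:
    """Physics convention: any number that rounds to the displayed numeral."""
    d = Decimal(numeral)
    # Half a unit in the last displayed place, e.g. 0.0005 for "1.234".
    half = Decimal(5).scaleb(d.as_tuple().exponent - 1)
    return d - half, d + half

print(mathematical_value("1.234"))
print(physics_interval("1.234"))
```

For the numeral 1.234 the second function yields the bounds listed for the Tilde notation in Table 1.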

We therefore propose to avoid category (c) and to give the single-number notation an
annotation to indicate that it does not have the usual mathematical meaning.
This has been done by Hickey, who introduced the Star notation of Table 1.

Difficulties of factored notation There are two problems with the classical 
notation. The first is the scanning problem: one needs to scan both bounds digit 
by digit to find the leftmost different digit. Only then does one have an idea 
of the width of the interval. The second problem, the problem of useless digits,
can also be seen in (2): the width of the interval is specified by no fewer than
five digits. Restricting oneself to four digits for this purpose would give almost as
much information about x; the difference is so small as not to be worth
that fifth digit. As we will show below, the same holds almost always for all
digits beyond the first two or three.

Factored notation solves the scanning problem; the problem of useless digits 
remains. To solve it also, we need to study quantitatively the information 
content of the statement that an unknown real x is contained in an interval 
[a,b]. 
4 Information theory



According to Shannon's theory of information (see for example, among many 
textbooks, Q), observations can reduce the amount of uncertainty about the 
value of an unknown quantity. The amount of information yielded by an ob- 
servation is the decrease (if any) in the amount of uncertainty. Shannon argues 
that the amount of uncertainty is appropriately measured by the entropy of the 
probability distribution over the possible values. For a uniform distribution on 
a finite number of values, this reduces to the logarithm of the number of possible 
values. It can be shown that the entropy for a distribution over n outcomes is 
maximized by the uniform distribution over these outcomes. 

When there are two equally probable possible values, and if one would like 
this logarithm to come out at unity, one takes 2 as base of the logarithms and 
one calls the unit of information bit, for binary unit of information. Thus, the 
binary digits carry at most one bit of information. Similarly, if one works with 
decimal digits, then it is convenient to use 10 as the basis of the logarithms. 

Thus information theory determines for each number base the maximum 
amount of information that can be carried by a digit. Normally, if we don't 
know what a number is, and we are only given the first k digits of a numeral 
denoting that number, we have no idea what the next digit should be. That 
is, all possibilities in {0,1,2,3,4,5,6,7,8,9} are equally probable so that the 
uncertainty is log_{10} 10 = 1. As a decimal digit can only distinguish between ten
possibilities, the efficiency of the (k + 1)st digit is one.
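As a small numerical illustration of the last two paragraphs, the entropy of a distribution over the ten digits can be computed in decimal units. The function name `entropy10` and the skewed example distribution are ours, chosen only for illustration.

```python
import math

def entropy10(probs):
    """Shannon entropy with base-10 logarithms, i.e. in decimal units of information."""
    return -sum(p * math.log10(p) for p in probs if p > 0)

uniform = [0.1] * 10               # no idea what the next digit is
skewed = [0.91] + [0.01] * 9       # hypothetical case: one digit is very likely

print(entropy10(uniform))          # the maximum a decimal digit can carry
print(entropy10(skewed))           # strictly less than the maximum
```

The ratio of the second value to the first is what the abstract calls the efficiency of a digit.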

In the set interpretation of interval arithmetic, we have information of the 
form that a real x belongs to a set S. According to information theory, this 
represents an uncertainty equal to the entropy of the probability distribution 
over the elements of S. What distribution to assume? We are only interested in 
the large differences in information carried by the successive digits of factored 
notation. These are large compared to those due to the differences among 
plausible distributions. 

The fact that we are only interested in sets that are bounded intervals, simpli- 
fies matters considerably. Plausible distributions for bounded intervals include 
the uniform and the beta distributions. From now on, if we know that x is in
an interval I, we assume that the probability of x belonging to any subinterval
of I only depends on the width of that subinterval and not on where in I this
subinterval is located. This property is implied by the uniform distribution over
I, and this is the distribution we assume for computation of the uncertainty in
the statement x ∈ I. This uncertainty is equal to -log_{10} w, in decimal units of
information, where w is the width of I.

5 Improvement of factored notation 

Factored notation solves the scanning problem. In this section we solve the 
remaining problem that typically many of the digits inside the brackets are 
useless. We do this by applying the formula found in Section 4 to determine the
information content of the digits in factored notation. As factored notation is
just an abbreviation of it, this holds for classical notation as well. 

We first consider a specific example in which we note a pattern of rapidly 
decreasing efficiency as more digits are added. We explain this phenomenon 
by a generally applicable formula, and use it to justify our recommendation to 
write no more than three decimal digits inside the brackets of factored notation. 

For the example, we randomly selected an interval under the constraints that 
both bounds have 15 digits, that the first five be the same, and that the interval 
be nonempty. Thus we came to consider the interval [a, b] that is, in factored 
notation, 

0.389015[282749894, 960538227]        (4)

The information content is -log_{10}(b - a), which is about 6.169 decimal units. If
we have to represent the information that a real is confined to this interval, but 
are only allowed to use two digits inside the brackets, then this interval has to 
be 0.389015[28, 97]. This interval has information content of about 6.161. Thus 
we saved twice seven digits and lost an amount of information equal to 0.008 
decimal units. Note that an optimally used pair of decimal digits in factored 
notation carries 1.000 decimal units of information. 
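The figures in this example can be checked with a few lines. The helper name `info` is ours; the bounds are subtracted as exact decimals so that only the final logarithm is computed in floating point.

```python
import math
from decimal import Decimal

def info(lo: str, hi: str) -> float:
    """Information content -log10(b - a), in decimal units, of 'x is in [lo, hi]'."""
    width = Decimal(hi) - Decimal(lo)      # exact decimal subtraction
    return -math.log10(float(width))

full = info("0.389015282749894", "0.389015960538227")   # interval (4)
short = info("0.38901528", "0.38901597")                # two digits inside the brackets

print(round(full, 3), round(short, 3), round(full - short, 3))
```

The third printed value is the total loss incurred by dropping seven digits from each bound.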

This example suggests that two decimals inside the brackets already give 
almost all the information contained in the statement that x is in (4). That
only two decimal digits inside the brackets are enough could be a misleading
feature of this particular example. To investigate this possibility, we analyse
the information content remaining for all possible ways of shortening (4). From
this we will see that a pattern emerges. We show that the pattern is not a 
peculiarity of the example. Because the pattern almost always occurs, we give 
it a name: Rule of One Tenth. Before investigating this rule, we first need to 
be more precise about shortening the representation of an interval. 

Inflation Consider the statement that x ∈ [a, b]. Let [a', b'] properly contain
[a, b]. Now it may be the case that x ∈ [a', b'] conveys almost as much information
about x as x ∈ [a, b] and yet [a', b'] requires fewer digits to write. Then
[a', b'] is a more efficient representation than [a, b].

A more efficient representation such as [a', b'] may be obtained by one or
more applications of an operation we refer to as "inflation" . 

Definition 1 Let I be the representation of an interval of which the bounds 
have a finite number of decimals. The operation of inflation has as result the 
representation of the smallest interval containing I where each bound has one 
less decimal than the corresponding bound in I.

In Table 2 we see some examples of inflation. Line 0 is a typical case. Line
1 illustrates that inflation may apply to intervals with an unequal number of 
decimals in the bounds. Line 2 is included to illustrate that inflation decreases 
the number of digits, so that the four-digit 0.9999 changes to the three-digit 
numeral 1.00. 
line number    before inflation    after inflation

0              0.123[456,789]      0.123[45,79]
1              0.1[2345,34]        0.1[234,4]
2              0.[1234,9999]       [0.123,1.00]
3              0.123[450,670]      0.123[45,67]
4              0.123[499,501]      0.123[49,51]

Table 2: Examples of inflation.
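One possible reading of Definition 1 is sketched below: drop the last decimal place of each bound, rounding the lower bound down and the upper bound up. The function name `inflate` is ours, and the sketch takes the bounds in classic notation as decimal strings rather than in factored form.

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_CEILING

def inflate(lo: str, hi: str) -> tuple[str, str]:
    """Drop one decimal place from each bound, rounding outward."""
    def shorten(numeral: str, rounding: str) -> str:
        d = Decimal(numeral)
        places = -d.as_tuple().exponent            # decimal places currently shown
        if places < 1:
            raise ValueError("no decimal place left to drop")
        unit = Decimal(1).scaleb(-(places - 1))    # one unit of the new last place
        return format(d.quantize(unit, rounding=rounding), "f")
    return shorten(lo, ROUND_FLOOR), shorten(hi, ROUND_CEILING)

print(inflate("0.123456", "0.123789"))   # line 0 of Table 2
print(inflate("0.12345", "0.134"))       # line 1: unequal numbers of decimals
print(inflate("0.123499", "0.123501"))   # line 4: width grows tenfold
```

Because this sketch counts decimal places rather than digits, line 2 of Table 2 comes out as an upper bound of 1.000 instead of the 1.00 shown there; the two numerals denote the same number.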





line   left boundary a       right boundary b      -log10(b - a)   information loss

 0     0.389015 282749894    0.389015 960538227    6.168905911
 1     0.389015 28274989     0.389015 96053823     6.168905907     0.000000005
 2     0.389015 2827498      0.389015 9605383      6.168905804     0.000000103
 3     0.389015 282749       0.389015 960539       6.168904843     0.000000961
 4     0.389015 28274        0.389015 96054        6.168898435     0.000006407
 5     0.389015 2827         0.389015 9606         6.168834366     0.000064069
 6     0.389015 282          0.389015 961          6.168130226     0.000704140
 7     0.389015 28           0.389015 97           6.161150909     0.006979316
 8     0.38901 52            0.38901 60            6.096910013     0.064240896
 9     0.38901 5             0.38901 6             6               0.096910013
10     0.3890 1              0.3890 2              5               1
11     0.389 0               0.389 1               4               1
12     0.3 89                0.3 90                3               1
13     0.3 8                 0.3 9                 2               1
14     0. 3                  0. 4                  1               1
15     0                     1                     0               1

Table 3: Intervals [a, b] containing an unknown real x. Information loss as the
result of successive inflations. Given that x is in [0, 1], the information content
of x ∈ [a, b] is -log_{10}(b - a). The loss due to inflation is in the last column.



Let us now consider the change in interval width due to inflation. In line 3
of Table 2 we see that it can be as little as zero. Line 4 shows that the width
can increase by a factor of 10. In such a case, the digits saved by inflation carry
as much information as is possible for a decimal digit.

In Table 3 we see in the top line the bounds of interval (4). Each next line
shows the result of inflation applied to the previous line. Thus it is true that
x is contained in each interval of the table. In the fourth column we see the
information content of the statement that x belongs to the interval shown in 
that line. The last column shows the decrease in information compared to the 
line before. This decrease is to be compared to the information content of the 
omitted decimal, which is 1. Thus, the last column contains the efficiency of 
showing the last decimal in each bound in the line before. 
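The table of successive inflations can be regenerated with the following sketch. It assumes both bounds start with the same number of decimals, as in interval (4); the names `inflate`, `info`, and `rows` are ours.

```python
import math
from decimal import Decimal, ROUND_FLOOR, ROUND_CEILING

def inflate(a: Decimal, b: Decimal) -> tuple[Decimal, Decimal]:
    """Drop the last decimal place of both bounds, rounding outward."""
    places = -a.as_tuple().exponent - 1            # decimal places after inflation
    unit = Decimal(1).scaleb(-places)
    return (a.quantize(unit, rounding=ROUND_FLOOR),
            b.quantize(unit, rounding=ROUND_CEILING))

def info(a: Decimal, b: Decimal) -> float:
    """-log10(b - a); adding 0.0 turns a possible -0.0 into 0.0 for printing."""
    return -math.log10(float(b - a)) + 0.0

a, b = Decimal("0.389015282749894"), Decimal("0.389015960538227")
rows = [(a, b, info(a, b), None)]                  # top line: no loss yet
while a.as_tuple().exponent < 0:                   # until no decimals remain
    previous = rows[-1][2]
    a, b = inflate(a, b)
    current = info(a, b)
    rows.append((a, b, current, previous - current))

for line, (lo, hi, content, loss) in enumerate(rows):
    loss_text = "" if loss is None else f"{loss:.9f}"
    print(f"{line:2d}  {str(lo):<18} {str(hi):<18} {content:.9f}  {loss_text}")
```

Each entry of the last column is the difference between two consecutive entries of the column before it, which is how the efficiency of the dropped digit is read off.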

As one goes down the table, considering successively more succinct, yet 
true statements about x, one sees an interesting transition about halfway. Of
course something special has to happen at the point where factored notation 
is 0.38901[5, 6]. The next more succinct intervals are, successively, 0.3890[1,2], 
0.389[0, 1] and so on. In this range, the information decrease is 1, exactly the 
information content of the decimal digit saved. That is, the digits that are 
saved here are fully efficient. Factored notation is not as useful here as it was 
higher up in the table. In fact, it is redundant, as there is always a pair of 
successive single decimals inside the brackets. An ad-hoc notation in the style 
of tilde notation has a considerable advantage here. I adopted the one proposed 
by Hickey and called it "Plus" in Table 1.

Let us now consider the most important part of Table 3. Suppose one considers
shortening the interval in the top line to 0.389015[28, 97] and suppose
one worries that too much information has been lost. The last column in line
7 shows that the additional digits contained in line 8 add only about one tenth
of the amount of information contained in the last digits of line 7, which is already
pretty low at around one tenth of those in the line above that. One can
summarize the last column above line 8 by the Rule of One Tenth:

Each additional digit carries about one tenth of the information in 
the previous one. 

The rule holds quite well from line 8 upwards. If it were exact, the last
column in line 1 would be 6 × 10^{-9} instead of the 5 × 10^{-9} actually observed. Is
this rule a fortuitous feature of this particular example? In the following, we 
will argue that it is not. 

The general case In Table 3 we see that the Rule of One Tenth only holds
over many lines with considerable fluctuations from line to line. In fact, in
Table 2 we saw that inflation can cause an increase in interval width of as little
as a factor of one and as much as a factor of ten. These factors correspond to
information losses of 0 and 1, respectively. What can we say in general about
interval widening due to inflation? 

We consider for the general case the interval shown digit by digit as 

    0.x_1 \cdots x_{j-1}[y_j \cdots y_{j+k-1} p, z_j \cdots z_{j+k-1} q],        (5)

where y_j < z_j and k ≥ 2. We ask whether the number of digits can be safely
decreased by one application of the inflation operation.

If p = q = 0, width does not increase, so inflation can be applied without
any loss of information. The largest information loss occurs if p = 9 and q = 1,
in which case the width increases by 18 × 10^{-j-k}. Let us take 10^{-j-k+1} as a
typical width increase, as it is a convenient value near midway between these extremes.
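The extremes quoted above can be confirmed by enumerating all hundred combinations of the dropped digits. The sketch measures the widening in units of 10^{-j-k}, the place value of p and q; the function name `width_increase` is ours.

```python
def width_increase(p: int, q: int) -> int:
    """Widening, in units of the dropped place, when p and q are removed.

    Rounding the lower bound down removes p units; rounding the upper
    bound up adds 10 - q units, or nothing when q is already 0.
    """
    return p + (10 - q) % 10

values = [width_increase(p, q) for p in range(10) for q in range(10)]
print(min(values), max(values), sum(values) / len(values))   # 0 18 9.0
```

Under the assumption that p and q are uniformly distributed, the mean is 9 units, so taking 10 units, that is 10^{-j-k+1}, errs slightly on the pessimistic side.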

This increase should be compared with the width w of (5). The comparison
is obscured by the large variation of w. It may be as little as 10^{-j-k} (see last
line of Table 2) and nearly as much as 10^{-j+1}. In the case (5) is narrowest,
inflation widens it typically by a factor ten. In that case p and q carry as
much information as is possible for a decimal digit. Perhaps all decimals should
be kept. In the case (5) is widest, inflation widens it by a negligible amount.
Inflation is advisable.

Apparently it does not help to consider the extreme values of w, as they
lead to contradictory advice. So let us consider average values of w. We assume
k ≥ 2 (we retain at least two digits inside the brackets). If the average is in the
order of 10^{-j}, then inflation causes negligible information loss. If the average
width is near 10^{-j-k}, then inflation causes the full amount of information loss,
so this is the worst case. To simplify matters, we make the worst case worse
and assume that w can range from 0 to 10^{-j+1}. This is only a small change,
as we are only interested in k ≥ 2, in which case the range from 0 to 10^{-j-k} is
negligible compared to the range from 0 to 10^{-j+1}.

It is simplest to assume that the probability distribution of w is not far from
uniform between 0 and 10^{-j+1}. In that case, it will usually be the case that

But one may prefer not to make assumptions about the probability distribution
of w. Then one may accept the assumption that the digits between the
brackets in (5) are independent random variables with a uniform distribution
on {0, ..., 9} under the constraint that y_j < z_j. The average width of (5) can
then be expressed as

    w = \sum_{s=0}^{8} \sum_{t=s+1}^{9} p_{st} w_{st}        (6)

where p_{st} is the probability of y_j = s and z_j = t, and w_{st} is the average width
under the constraint that y_j = s and z_j = t. For i between 0 and 8, if y_j = i,
then z_j can be i+1, ..., 9. Under the assumption about the distributions of the
digits involved, there are 45 equally probable pairs (s, t), so p_{st} = 1/45.

We are interested in a lower bound for w_{st}. Each width is bounded below by
(t - s - 1) × 10^{-j}. Whatever the distribution, the average is also bounded below
by (t - s - 1) × 10^{-j}. Because this bound depends only on t - s, we rewrite (6)
as

    w = \sum_{d=1}^{9} \sum_{a=0}^{9-d} p_{a,a+d} w_{a,a+d}

Using w_{a,a+d} \ge (d - 1) \times 10^{-j} and p_{st} = 1/45 we have

    w \ge (1/45) \sum_{d=1}^{9} (d - 1) \times 10^{-j}
      \ge (36/45) \times 10^{-j} = (4/5) \times 10^{-j}

Moreover, w is bounded above by 10^{-j+1}. So it is reasonable to assume that w
is in the order of 10^{-j}.
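The counting behind these bounds can be checked in exact arithmetic. The sketch below works in units of 10^{-j} and uses names of our own choosing; it also evaluates the double sum with every term of the inner sum retained, which is our addition and not a figure from the text.

```python
from fractions import Fraction

# All admissible pairs of leading digits inside the brackets: y_j = s < z_j = t.
pairs = [(s, t) for s in range(9) for t in range(s + 1, 10)]
p = Fraction(1, len(pairs))                       # each pair equally probable

# Bound as stated in the text: one term (d - 1) for each difference d = t - s.
stated_bound = p * sum(d - 1 for d in range(1, 10))

# Same lower bound (t - s - 1) per pair, but summed over every pair.
all_pairs_bound = p * sum(t - s - 1 for s, t in pairs)

print(len(pairs), stated_bound, all_pairs_bound)
```

Both results lie between 4/5 and the upper bound of 10 units, so either supports the conclusion that w is in the order of 10^{-j}.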

Hence inflation widens an interval with a width of about 10^{-j} to one that has
a width of about 10^{-j} + 10^{-j-k+1} = 10^{-j}(1 + 10^{-k+1}). Thus, the uncertainty
decreased by the last digit is in the order of log_{10}(1 + 10^{-k+1}), which is about
10^{-k+1}, neglecting a factor of ln 10.
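The quality of this approximation is easy to inspect numerically. The sketch compares log_{10}(1 + 10^{-k+1}) with its first-order approximation 10^{-k+1} / ln 10 for a few values of k; the function names are ours.

```python
import math

def exact_loss(k: int) -> float:
    """Information lost by dropping the digit after k retained digits."""
    return math.log10(1 + 10.0 ** (-k + 1))

def first_order(k: int) -> float:
    """First-order approximation, keeping the factor 1 / ln 10."""
    return 10.0 ** (-k + 1) / math.log(10)

for k in range(2, 7):
    print(k, exact_loss(k), first_order(k), exact_loss(k + 1) / exact_loss(k))
```

The last printed column is the ratio between the losses for successive values of k, which is the quantity the Rule of One Tenth refers to.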

This is also the decrease in information gain for every additional digit inside
the brackets in factored notation. This is also the Rule of One Tenth observed in
Table 3 when averaging over many rows. We can expect that the third decimal in
a factored notation only increases information by 0.01 of the potential information
in a decimal digit, and is therefore of questionable value. We recommend
factored notation with two decimals inside the brackets, while keeping in mind
that the rule does not apply in rare cases such as line 4 in Table 2.

6 Conclusions 

Interval methods are coming of age. When interval software was experimental, 
it didn't matter whether interval output was easy to read. Now that the main 
technical challenges have been overcome, and we at least know how to ensure 
that the floating-point bounds include all reals that are possible values of the 
variable concerned, we need to turn our attention to small, mundane matters, 
which include taking care of the convenience of users. Factored notation is an 
advance in this respect. However, without some attention to the number of 
digits inside the brackets, one runs the risk of specifying in maximum accuracy 
not the number under consideration, but the unavoidable lack of information 
about this number. 